Keystone: An Open Framework for Architecting 
Trusted Execution Environments 


Dayeol Lee David Kohlbrenner Shweta Shinde 
dayeol@berkeley.edu dkohlbre@berkeley.edu shwetas@berkeley.edu 
UC Berkeley UC Berkeley UC Berkeley 

Krste Asanović Dawn Song 
krste@berkeley.edu dawnsong@berkeley.edu 
UC Berkeley UC Berkeley 


Abstract 


Trusted execution environments (TEEs) see rising use in 
devices from embedded sensors to cloud servers and en- 
compass a range of cost, power constraints, and security 
threat model choices. On the other hand, each of the current 
vendor-specific TEEs makes a fixed set of trade-offs with 
little room for customization. We present Keystone, the first
open-source framework for building customized TEEs.
Keystone uses simple [...]


We showcase how Keystone-based TEEs run on unmodified
RISC-V hardware and demonstrate the strengths of our de- 
sign in terms of security, TCB size, execution of a range of 
benchmarks, applications, kernels, and deployment models. 


CCS Concepts: • Security and privacy → Trusted computing;
Hardware security implementation; Software and application
security.


1 Introduction


The last decade has seen the proliferation of trusted exe- 
cution environments (TEEs) to protect sensitive code and 
data. All major CPU vendors have rolled out their TEEs
(e.g., ARM TrustZone, Intel SGX, and AMD SEV) to provide
secure execution environments, commonly referred to as
enclaves [1, 51, 64]. TEEs have use-cases in diverse
deployment environments, ranging from cloud servers and mobile
phones to ISPs, IoT devices, sensors, and hardware tokens.
Unfortunately, each vendor TEE enables only a small por- 
tion of the possible design space across threat models, hard- 
ware requirements, resource management, porting effort, 
and feature compatibility. When a cloud provider or software
developer chooses a target hardware platform, they are
bound by its design limitations regardless
of their actual application needs. Constraints breed creativity,
giving rise to significant research effort in working around 
these limits. For example, Intel SGXv1 [64] requires statically 
sized enclaves, lacks secure I/O and syscall support, and is 
vulnerable to significant side-channels [35]. Thus, to exe- 
cute arbitrary applications, the systems built on SGXv1 have 
inflated the Trusted Computing Base (TCB) and are forced 
to implement complex workarounds [18, 22, 31]. As only 
Intel can make changes to the inherent design trade-offs in 
SGX, users had to wait for changes like dynamic resizing of 
enclave virtual memory in SGXv2 [63]. Unsurprisingly, these
and other similar restrictions have led to a proliferation of new
TEEs on other ISAs (e.g., OpenSPARC [30], RISC-V [36, 80]).
However, each such redesign requires considerable effort
and only provides another fixed design point.
We advocate that the hardware should provide [...]
customizable TEEs. We can draw an analogy with the move from
traditional networking solutions to Software Defined Networking
(SDN), where exposing the packet forwarding primitives to 
the software has led to far more novel designs and research. 
Such a paradigm shift in TEEs will pave the way for
use-case customization. It will allow the features and the
[...] to be tuned for each hardware platform and
use-case from a set of common software components, drawing
on ideas from modular kernel concepts [40, 60, 78]. This
motivates the need for Customizable TEEs—an abstraction 
that allows entities that create the hardware, operate it, and 
develop applications to configure and deploy various TEE 
designs from the same base. Customizable TEEs promise
independent exploration of gaps/trade-offs in existing designs,
quick prototyping of new feature requirements, a shorter
turn-around time for fixes, adaptation to threat models, and 
usage-specific deployment. 

For realizing this vision, our first observation is the need
for a highly programmable [...]. Second, we must [...].
We note that [...] results
in a trusted layer with a mix of security and virtualization re- 
sponsibilities, thus complicating the most critical component. 
Similarly, firmware and micro-code are not programmable to
a degree that satisfies our requirements. These two requirements
ensure we avoid the mistake of using hardware with
a separation mechanism encumbered with a static boundary
between what is trusted and untrusted. Lastly, we draw
inspiration from the proliferation of commercial (cf. Intel SGX,
TrustZone) and non-commercial TEEs (cf. Sanctum [36],
Komodo [41]) which demonstrate the need for a common, 
portable software base adaptable to ever-changing hardware 
capabilities and use-case demands. 

To this end, we propose Keystone—the first open-source 
framework for building customized TEEs. We built Keystone 
on unmodified RISC-V using its standard specifications [17] 
for physical memory protection (PMP)—a primitive which 
allows the programmable machine mode underneath the OS 
in RISC-V to specify arbitrary protections on physical memory
regions. We use this machine mode to execute a trusted
security monitor (SM) to provide security boundaries without
needing to perform any resource management. Critically,
each enclave operates in its own isolated physical memory
region and has its own supervisor-mode runtime (RT) component
to manage the virtual memory of the enclave and more.
With this novel design, any enclave-specific functionality
can be implemented cleanly by its RT, while the SM manages
hardware-enforced guarantees. An enclave’s RT implements
only the required functionality, communicates with the SM, 
mediates communication with the host via shared memory, 
and services the enclave user-mode application (eapp). 

Our choice of RISC-V and the logical separation between
[...]. Specifically, Keystone’s SM uses hardware primitives to
[...] self-paging, and more inside the enclave. For strengthening
the security, our [...]. We
demonstrate the potential of this with a highly configurable 
cache controller to, in concert with PMP, transparently de- 
fend against physical adversaries and cache side-channels. 

We built Keystone, the SM, two RTs (our native RT—Eyrie— 
and an off-the-shelf microkernel seL4 [55]), and several mod- 
ules which together allow enclave-bound user applications 
to selectively configure and use the above features (Figure 1). 
We extensively benchmark Keystone on 4 suites with varying 
workloads: RV8, IOZone, CoreMark, and Beebs. We show- 
case use-case studies where Keystone can be used for secure 
machine learning (Torch and FANN frameworks) and crypto- 
graphic tasks (libsodium) on embedded devices and cloud 
servers. Lastly, we test Keystone on different RISC-V sys- 
tems: the HiFive Freedom Unleashed, 3 in-order cores and 
1 out-of-order core via FPGA, and a QEMU emulation—all 
without modification. Keystone is fully open-source. 


Contributions. We make the following contributions: 


• Customizable TEEs. We define a new paradigm wherein
  the hardware manufacturer, hardware operator, and
  the enclave programmer can tailor the TEE design.
• Keystone Framework. We present the first framework to
  configure, build, and instantiate customized TEEs. Our
  principled way of ensuring modularity in Keystone
  allows us to customize the design dimensions of TEE
  instances as per the requirements.
• Open-source Implementation. We demonstrate advantages
  of different Keystone TEE configurations that
  are tailored for minimizing the TCB, adapting to threat
  models, using hardware features, handling workloads,
  or providing rich functionality without any
  micro-architectural changes. A typical Keystone-instantiated
  TEE design adds a total TCB of 12-15 K lines of code
  (LoC) to an enclave-bound application, of which the
  SM consists of only 1.6 KLoC added by Keystone.
• Benchmarking & Real-world Applications. We evaluate
  Keystone on 4 benchmarks: CoreMark, Beebs, and RV8
  (< 1% overhead), and IOZone (40%). We demonstrate
  real-world machine learning workloads with Torch in
  Eyrie (7.35%), FANN (0.36%) with seL4, and a
  Keystone-native secure remote computation application. Finally,
  we demonstrate defenses against physical adversaries
  with memory encryption and cache side-channels.


2 A Common Base for Diverse TEEs 
2.1 Background: Commercial TEEs 


Current widely-used TEE systems cater to specific and valuable
use-cases but occupy only a small part of the wide design
space (see Appendix A). Consider the case of a heavy server 
workload (databases, ML inference, etc.) running in an un- 
trusted cloud environment. One option is an Intel SGX-based 


Keystone

Figure 1. Keystone system with host processes, untrusted OS,
security monitor, and multiple enclaves (each with runtime and eapp).


solution which has a large software stack [18, 22, 31] to ex- 
tend the supported features. On the other hand, an AMD 
SEV-based solution isolates a full VM with a large TCB. In
both cases, additional defenses against side-channels require
further user-space software mechanisms. If
we consider edge-sensors or IoT applications, the available
solutions are TrustZone based. While more flexible than SGX 
or SEV, TrustZone supports only a single hardware-enforced 
isolated domain called the Secure World. Any further iso- 
lation needs multiplexing between secure applications via 
software-based Secure World OS solutions [12]. Thus,
irrespective of the TEE, developers often compromise their
requirements (e.g., resort to a large TCB solution, one isolation
domain) or build their custom design. One such emerging
direction is to use [...] software, [...]. These designs [...].

Several proposals in this area
have demonstrated the feasibility of this approach. Sanctum
uses a series of modifications to hardware to construct
user-space enclaves for RISC-V. Komodo takes this concept
further and provides a verified monitor that executes on top 
of ARM’s TrustZone. While these systems inherit the limi- 
tations of their underlying designs (e.g., hardware changes 
or only two security domains in TrustZone), monitor-based 
TEEs are a very promising direction. 


2.2 Customizable TEEs 


We call our model customizable TEEs. It uses a common 
software framework to assemble a specialized TEE specific 
to the use-case with multiple stakeholders’ inputs. [...]
Realizing a specific TEE instance involves the platform
provider’s choice of the hardware interface, the trust model,
and the enclave programmer’s feature requirements. The
entities offload their choices to a framework that composes
the required modules to instantiate a specialized TEE.

A motivation for customizable TEEs is that the threat 
model may differ depending on the use case, the application 
or the hardware platform. Even on the same platform with 
the same SM, different applications may operate under
differing threat models. For this reason, we allow each enclave
to specify its [...]. Consider
an IoT sensor application that signs measurements for
authenticity guarantees and an adversary using a cache oc- 
cupancy side-channel. In this case, the sensor driver must 
be protected and requires runtime memory integrity, but 
not memory confidentiality. The signing process requires 
both memory integrity and confidentiality. Thus, a possible 
configuration would be to have the cryptographic library op- 
erate with a private cache partition enclave while the driver 
may operate in a basic isolated enclave. An appropriate SM 
mechanism (e.g. mailboxes) can ensure authenticated com- 
munication between these two enclaves. An adversary using 
a cache occupancy side-channel against the driver learns 
only the public measurements, and cannot learn anything 
about the cryptographic library. By allowing each enclave 
to specify and deploy its own defenses, we can optimize our 
use of the available resources (in this case, limited private 
cache space) and expensive security mechanisms. 

The existing commercial TEE systems offer inflexible threat 
models linked to the respective hardware platform. Notably, 
Intel’s SGX [64] does not support any configuration of its 
memory protection systems as would be desirable for use 
cases not requiring expensive memory encryption. On the 
other hand, while offering some software and hardware cus- 
tomization, ARM’s TrustZone provides an inferior substrate 
to build a modular TEE. Core to TrustZone’s design is the 
concept of only two security domains. A TrustZone TEE im- 
plementing multiple enclaves must use the memory manage- 
ment unit (MMU) for further isolation. This fundamentally 
limits what operations enclaves can be allowed to perform 
and limits enclaves to user-mode. This limitation naturally 
extends to all TEE systems built using TrustZone as a base 
like Komodo. On the hardware side, TrustZone relies on 
system-wide bus-address filters (e.g., the TZC-400) to separate
secure from insecure DRAM partitions, whereas RISC-V
provides per-hardware-thread views of physical memory via
machine-mode and PMP registers. Using RISC-V thus allows
[...] trusted boot process, a hardware source of randomness, and a
trusted boot process. Key provisioning [15] is an orthogonal
problem. For this paper, we assume a simple manufacturer 
provisioned key. 


2.3 Entities in TEE Lifecycle 


We define five logical entities in customizable TEEs:
Hardware manufacturer designs and fabricates RISC-V
hardware including relevant IP for trusted boot.
Keystone platform provider purchases manufactured hardware;
operates the hardware; makes it available for use to
its customers; configures the SM.
Keystone programmer develops Keystone software components
including SM, RT, and eapps; we refer to the
programmers who develop these specific components as the
SM, RT, and eapp programmers, respectively.
Keystone user chooses a Keystone configuration of RT and 
an eapp. They instantiate an enclave which can execute on 
hardware provisioned by the Keystone platform provider. 
Eapp user interacts with the eapp executing in an enclave 
on the TEE instantiated using Keystone. 

In real-world deployments, a single entity can perform multiple
roles. For example, suppose Acme Corp. hosts its
website on an Apache webserver executing on hardware
manufactured by Bar Corp., in a Keystone-based enclave hosted
on Cloud Corp.'s cloud service. In this scenario, Bar is the
hardware manufacturer; Cloud is a Keystone platform
provider and can also be an RT programmer and SM programmer;
the Apache developers are eapp programmers; Acme Corp.
is the Keystone user; and the person who uses the website
is the eapp user.


3 Keystone Overview 


We designed and built Keystone on RISC-V. RISC-V is an 
open ISA with multiple open-source core implementations 
[19, 29]. It currently supports up to four privilege modes: U- 
mode (user) for user-space processes, S-mode (supervisor) for 
the kernel, H-mode (hypervisor) for the hypervisor, and M- 
mode (machine) which directly accesses physical resources 
(e.g., interrupts, memory, devices). At the time of writing, 
H-mode (hypervisor) is not included in the standard specifi- 
cation. Keystone will also be able to support hypervisor-level 
isolation when H-mode becomes available. 


3.1 Design Principles 


We design customizable TEEs with maximum degrees of 
freedom and minimum effort using the following principles. 


Leverage programmable layer and isolation primitives
below the untrusted code. We design a reference-monitor-style
security monitor (SM) to enforce TEE guarantees on
the platform using four properties of M-mode: (a) it is
programmable by platform providers, (b) it meets our needs for a
[...], (c) it controls hardware delegation [...]; [...]
M-mode's control of RISC-V's physical memory protection
(PMP) standard [17] enables isolation of memory-mapped
control features at runtime.


The SM [...] highest privilege. It has few non-security
responsibilities. This [...] allows it to present clean
abstractions. Our S-mode runtime (RT) and U-mode enclave
application (eapp) both reside in enclave address space and [...]
operations on behalf of the eapp (e.g., attestation). Each
enclave instance may choose its own RT, which is never shared
with other enclaves.


Figure 2. Keystone End-to-end Overview. ① Platform provider
configures the SM. ② Keystone compiles and generates the SM boot
image. ③ Platform provider deploys the SM. ④ Developer writes
an eapp, configures the enclave. ⑤ Keystone builds the binaries,
computes measurements. ⑥ Untrusted host binary is deployed to
the machine. ⑦ Host deploys the RT, the eapp, and initiates the
enclave creation. ⑧ Remote verifier can attest based on known
platform specifications, keys, and SM/enclave measurements.

Design modular layers. Keystone uses modularity (SM, 
RT, eapp) to support a variety of workloads. It frees Keystone
platform providers and Keystone programmers from
retrofitting their requirements and legacy applications into
an existing TEE design. [...]


Allow fine-grained TCB configuration. Keystone can instantiate
TEEs with the minimal TCB for given specific use-cases.
The enclave programmer can further optimize the TCB
via RT choice and eapp libraries using existing user/kernel 
privilege separation. For example, if the eapp does not need 
libc support or dynamic memory management, Keystone 
will not include them in the enclave. 


3.2 Keystone Enclave Workflow 


Figure 2 details the steps from Keystone provisioning to eapp 
deployment. The platform provider instantiates an SM with a
proper hardware specification and security extensions that
bring additional isolation guarantees such as cache
partitioning. Independently, the enclave developers use Keystone




tools and libraries to write eapps and RTs with rich features
such as virtual memory management and system calls. The
RT may use the available SM SBI calls, but these do not change
the isolation guarantees that the SM enforces.


3.3 Writing eapps 


Keystone supports three ways of writing enclave applications:
(a) standalone Keystone-native eapps, (b) unmodified RISC-V
binaries, or (c) manually partitioned applications,
running selected parts in the enclave. [...] will allow
Keystone to operate as a backend for cross-enclave SDKs (e.g.,
OpenEnclave [11], Asylo [67]) to allow for a wide variety of
programming models. In Sections 7.4 and 7.3, we demonstrate
unmodified RISC-V binaries and manual partitioning.


3.4 Threat Model 


The Keystone framework trusts the PMP specification as
well as the PMP implementation to
be bug-free. The [...] only after
verifying that the SM measurement is correct, signed by trusted
hardware, and has the expected version. The SM only trusts
the hardware, [...] the RT trusts the SM, [...]


Keystone can operate under diverse threat models, each 
requiring different defense mechanisms. For this reason, we 
outline all relevant attackers for Keystone. We allow the 
selection of a sub-set of these attackers based on the scenario. 
For example, if the user is deploying TEEs in their private 
data centers or home appliances, a physical attacker may 
not be a realistic threat and Keystone can be configured to 
operate without physical adversary protections. 


Attacker Models. Keystone protects the confidentiality and 
integrity of all enclave code and data at all times after cre- 
ation. We define four classes of attackers who aim to com- 
promise our security guarantees: 


A physical attacker A_Phy can intercept, modify, or replay
[...]. We assume that the physical attacker [...]
package. A_PhyC is for confidentiality, A_PhyI is for integrity.
A software attacker A_SW can control host applications, the
[...] launch adversarial [...].
A side-channel attacker A_SC can glean information by
ob[...]: the cache side-channel (A_Cache), the timing
side-channel (A_Time), or the controlled channel (A_Cntrl).
[...] Keystone allows the OS to DoS enclaves,
as the OS can refuse services to user applications at any time.


Scope. Keystone currently has no meaningful mechanisms
to protect against [...] [27, 56]. Existing and future defenses
against this class of attacks can be
retrofitted into Keystone [24, 91]. Keystone does not natively
[...] enclaves or [...] should use existing software solutions
[...] [42], and hardware manufacturers can supply timing
side-channel resistant hardware [54]. [...] (e.g., memory
bus [59]) are also out of scope of this paper and
can be orthogonally mitigated [...]. The SM
exposes a limited API (i.e., SBI) to the host OS and the
enclave. We do not provide non-interference guarantees for this
API [41]. Similarly, [...] system calls into the host OS.
We assume that the RT and the eapp [...] via
this untrusted interface [31, 73, 82]. We assume that the SM,
RT, and eapp are bug-free. This is a strong assumption but
can be partially achieved with formal verification [41, 68].


Caller     SM SBI       Description
OS         create       Validate and measure the enclave
OS         run          Start enclave and boot RT
OS         resume       Resume enclave execution
OS         destroy      Clean & release enclave memory
RT         stop         Pause enclave execution
RT         exit         Terminate the enclave
RT         attest       Get a signed attestation report
RT         random       Get secure random values
OS & RT    extension*   Platform-specific functions

Table 1. The SBI functions the SM provides. *The SM can provide
additional functions (e.g., dynamic resizing) depending on the platform.
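
The caller column of Table 1 can be read as a small access-control table. The following Python sketch is only an illustration of that reading, not Keystone's SM code: the names `SBI_CALLERS` and `sbi_call` are hypothetical, the caller assignment follows the table as reconstructed here, and a real SM would identify the caller from the privilege context of the trap rather than from a string.

```python
# Which caller may invoke which SM SBI function, following Table 1.
# "extension" stands for the platform-specific functions.
SBI_CALLERS = {
    "create": {"OS"},
    "run": {"OS"},
    "resume": {"OS"},
    "destroy": {"OS"},
    "stop": {"RT"},
    "exit": {"RT"},
    "attest": {"RT"},
    "random": {"RT"},
    "extension": {"OS", "RT"},
}


def sbi_call(caller: str, function: str) -> str:
    """Reject unknown functions and callers that Table 1 does not list."""
    allowed = SBI_CALLERS.get(function)
    if allowed is None:
        raise ValueError(f"unknown SBI function: {function}")
    if caller not in allowed:
        raise PermissionError(f"{caller} may not call {function}")
    return f"{function} dispatched for {caller}"


print(sbi_call("OS", "create"))
print(sbi_call("RT", "attest"))
```

Keeping the table as data makes the limited size of the interface visible: the host OS drives the enclave lifecycle, while the RT can only pause, leave, or query the SM.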


4 Keystone Security Monitor 


The core of a Keystone TEE is the Security Monitor (SM). 
As the SM uses only standard RISC-V features, it is easily 
portable to the other RISC-V platforms. In addition, Keystone 
provides an easy way of configuring and compiling the SM 
depending on the underlying platform. With this design, we 
show how Keystone integrates with optional hardware to 
provide additional security guarantees such as cache side-channel
defenses without any application changes. By design, [...].
This allows for a [...]
low attack surface, highest-privilege component.


4.1 Memory Isolation


Keystone only requires the RISC-V hardware to provide simple
security primitives, ([...] SM) to safely validate their decisions.


Figure 3. How Keystone uses RISC-V PMP for flexible, dynamic
memory isolation. pmpaddr and pmpcfg control and status registers
(CSRs) are used to specify PMP entries. The SM uses a few PMP
entries to guard its own memory (SM) and enclave memories (E1,
E2). Upon enclave entry, the SM will reconfigure the PMP such that
the enclave can only access its own memory (E1) and the untrusted
buffer (U1).


Background: RISC-V Physical Memory Protection. Keystone
uses physical memory protection (PMP), a feature provided
by RISC-V. PMP restricts the physical memory that
U-mode and S-mode can access to regions defined by PMP
entries (see Figure 3). Each PMP entry controls the U-mode
and S-mode permissions to a customizable region of physical
memory.¹ The PMP address registers encode the address
of a contiguous physical region, configuration bits specify
the r-w-x permissions for U/S-mode, and two addressing-mode
bits select the region encoding to support various sizes of
regions (arbitrary ranges and naturally aligned power-of-two
regions). PMP entries are statically prioritized with
the lower-numbered PMP entries taking priority over the
higher-numbered entries. If U- or S-mode attempts to access
a physical address and it does not match any PMP entry,
the access is denied.
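
The priority rule can be illustrated with a small software model. The Python sketch below is not the hardware logic and not Keystone code: `PmpEntry` and `pmp_check` are hypothetical names, regions are given as plain base/size ranges instead of the encoded pmpaddr format, accesses are single addresses rather than multi-byte ranges, and the addresses are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PmpEntry:
    """Simplified PMP entry: a physical range plus its U/S-mode permissions."""
    base: int
    size: int
    perms: str  # any subset of "rwx"; the empty string grants nothing

    def matches(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


def pmp_check(entries: List[PmpEntry], addr: int, access: str) -> bool:
    """Decide whether a U/S-mode access ('r', 'w' or 'x') to addr is allowed.

    Entries are scanned from index 0 upward, so the lowest-numbered
    matching entry decides. An address that matches no entry is denied.
    """
    for entry in entries:
        if entry.matches(addr):
            return access in entry.perms
    return False


# Hypothetical layout: a protected region first, a catch-all entry last.
entries = [
    PmpEntry(base=0x8000_0000, size=0x20_0000, perms=""),  # protected region
    PmpEntry(base=0x0, size=1 << 40, perms="rwx"),         # everything else
]

print(pmp_check(entries, 0x8000_1000, "r"))  # inside the protected region
print(pmp_check(entries, 0x9000_0000, "w"))  # covered only by the catch-all
```

Because the first match wins, a low-numbered entry with no permissions can carve a protected hole out of a permissive, lower-priority entry that covers all of memory.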


Enforcing Memory Isolation via the SM. PMP makes Key- 
stone memory isolation enforcement flexible in three ways: 


(a) multiple discontiguous enclave memory regions can coexist
[...] Keystone [...] (code,
stack, data such as enclave metadata and keys), [...]


¹ Currently, processors have up to 16 M-mode configurable PMP entries.

[...] request, [...]. Since an enclave's PMP
entry has a higher priority than the OS PMP entry (the last in
Figure 3), the OS and other user processes cannot access the
enclave region. A valid request requires that enclave regions
not overlap with each other or with the SM region.

During control-transfer to an enclave, the SM (for the
current core only): (a) [...] enclave.
This allows the enclave to access its own memory and
no other regions. During a context switch to a non-enclave,
[...] OS. Enclave [...]
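
The effect of these permission changes on a single core can be sketched as follows. This is a toy model under stated assumptions, not the SM implementation: `CorePmp` and its methods are hypothetical names, regions are identified by name instead of address ranges, the exact permission flips are inferred from the Figure 3 caption and the surrounding text, and the shared untrusted buffer (U1) is left out.

```python
class CorePmp:
    """PMP state of one core as an ordered list of [region, perms] pairs.

    Index 0 has the highest priority. The "OS" entry is last and stands
    for an entry that covers all of memory.
    """

    def __init__(self):
        self.entries = [["SM", ""], ["OS", "rwx"]]

    def _set(self, region, perms):
        for entry in self.entries:
            if entry[0] == region:
                entry[1] = perms
                return
        raise KeyError(region)

    def add_enclave(self, region):
        # Placed above the OS entry with no permissions: the host loses access.
        self.entries.insert(len(self.entries) - 1, [region, ""])

    def enter_enclave(self, region):
        self._set(region, "rwx")  # the enclave may use its own memory
        self._set("OS", "")       # and nothing else

    def exit_enclave(self, region):
        self._set(region, "")
        self._set("OS", "rwx")

    def can_access(self, region):
        """The first entry covering the region decides (the OS entry covers all)."""
        for name, perms in self.entries:
            if name == region or name == "OS":
                return perms != ""
        return False


REGIONS = ("SM", "E1", "E2", "DRAM")

core = CorePmp()
core.add_enclave("E1")
core.add_enclave("E2")
host_view = {r: core.can_access(r) for r in REGIONS}

core.enter_enclave("E1")
enclave_view = {r: core.can_access(r) for r in REGIONS}
core.exit_enclave("E1")

print(host_view)
print(enclave_view)
```

In this model the host sees ordinary DRAM but neither enclave, while code running inside E1 sees only E1; the SM region is inaccessible in both views.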


PMP Enforcement Across Cores. Each core has its own
complete set of PMP entries. During enclave creation, PMP
changes must be propagated to all cores via inter-processor
interrupts (IPIs). The SM executing on each of the cores
handles these IPIs by removing the access of other cores to
the enclave. During the enclave execution, changes to the
PMP entries (e.g., context switches between the enclave and
the host) are local to the core executing it and need not be
propagated to the other cores. PMP synchronization IPIs are
only sent during enclave creation and destruction.
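
A minimal sketch of this synchronization pattern, assuming one IPI per remote core and a per-core dictionary as a stand-in for the PMP registers. `Machine` and its methods are hypothetical names chosen for illustration.

```python
class Machine:
    """Each core keeps its own region -> permission map (its PMP view)."""

    def __init__(self, num_cores):
        self.cores = [{"OS": "rwx"} for _ in range(num_cores)]
        self.ipis_sent = 0

    def _broadcast(self, initiator, update):
        # The initiating core applies the change locally and sends one
        # IPI to every other core, whose SM applies the same change.
        for core_id, pmp in enumerate(self.cores):
            if core_id != initiator:
                self.ipis_sent += 1
            update(pmp)

    def create_enclave(self, initiator, region):
        self._broadcast(initiator, lambda pmp: pmp.__setitem__(region, ""))

    def destroy_enclave(self, initiator, region):
        self._broadcast(initiator, lambda pmp: pmp.pop(region))

    def enter_enclave(self, core_id, region):
        # Local change only: no IPIs are needed.
        self.cores[core_id][region] = "rwx"
        self.cores[core_id]["OS"] = ""

    def exit_enclave(self, core_id, region):
        self.cores[core_id][region] = ""
        self.cores[core_id]["OS"] = "rwx"


machine = Machine(num_cores=4)
machine.create_enclave(initiator=0, region="E1")
ipis_after_create = machine.ipis_sent

machine.enter_enclave(core_id=2, region="E1")
ipis_after_enter = machine.ipis_sent
views_during_run = [pmp["E1"] for pmp in machine.cores]
machine.exit_enclave(core_id=2, region="E1")

machine.destroy_enclave(initiator=0, region="E1")
print(ipis_after_create, ipis_after_enter, machine.ipis_sent)
```

The counter only moves on creation and destruction, which is the property the paragraph describes: context switches stay core-local.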


[...] used for cases like self-paging, as described in Section 5.1.


Naively, Keystone supports N - 2 simultaneously created
enclaves, where N is the number of PMP entries available,
since one entry each is taken by the SM and the OS.
Alternatively, with adjacent allocations by the OS, Keystone
can virtualize the PMP entries at the cost of disallowing
memory reclamation until all latter enclaves are destroyed. Future
SM and RT features that support relocation may allow for
complete virtualization of PMP entries.
Similarly, the hypervisor mode (H-mode)
would allow for an additional layer of address translation to
transparently virtualize PMP entries [7].


4.2 Post-creation In-enclave Page Management 


Keystone has a different memory management design from 
most TEEs (see Figure 4). It uses the OS-generated page ta- 
bles for initialization and then delegates virtual-to-physical 
memory mapping entirely to the enclave during execution. 


Since RISC-V provides per-hardware-thread views of the 


Our kernel driver uses both the Buddy Allocator and the Contiguous 
Memory Allocator (CMA) to dynamically allocate enclave memory with 
various sizes. 




(a) Intel SGX (b) Komodo (c) Keystone (d) Xen


Figure 4. Memory Management Designs (shaded area is untrusted). 
(a) Untrusted OS manages memory, translates virtual-to-physical 
address. (b) Page tables inside the enclave but monitor creates 
mappings. (c) Delegates page management to enclave with its own 
page table. (d) Hypervisor for page management, 2 page tables. 


physical memory via machine-mode and PMP registers, it allows
Keystone to have multiple concurrent and [...]
physical memory partitions. With an isolated S-mode inside
the enclave, Keystone can execute its own virtual memory
management, which manipulates the enclave-specific page
tables. Page tables are always inside the isolated enclave
memory space. [...]


[...] Exceptions (e.g., page faults) may be
safely delegated to the RT via the RISC-V exception delegation
register. The RT then handles exceptions as needed. To
prevent an enclave holding a core to DoS the host, the SM sets a
timer before it enters the enclave. When the SM regains
control after the timer interrupt triggers, it may return control
to the host OS or request that the enclave cleanly exit.


4.4 Enclave Lifecycle 


Keystone enclaves go through three distinct changes during 
their lifecycle. At creation, Keystone measures the enclave 
memory to ensure that the OS has loaded the enclave binaries 
correctly to the physical memory. Keystone uses the initial 
virtual memory layout for the measurement because the 
physical layout can legitimately vary (within limits) across 
different executions. For this, the SM expects the OS to initial- 
ize the enclave page tables and allocate physical memory for 
the enclave. The SM walks the OS-provided page table and 
checks if there are invalid mappings and ensures a unique 
virtual-to-physical mapping. The SM then hashes page con- 
tents along with the virtual addresses and the configuration 
data. At execution, the SM sets PMP entries and transfers 
control to the enclave entry point. On an OS-initiated
destruction, the SM clears the enclave memory region before
returning the memory to the OS. The SM cleans and frees all the
enclave resources, PMP entries, and enclave metadata.


Figure 5. Memory Model for Various TEE Scenarios. Ø: baseline, C:
cache partitioning, O: on-chip scratchpad, P: enclave self-paging, E:
software memory encryption, E_HW: hardware memory encryption.


4.5 TEE Primitives


Keystone supports the following standard primitives.


Secure Boot. A Keystone root-of-trust can be either
tamper-proof software (e.g., a zeroth-order bootloader) or hardware
(e.g., crypto engine). At each CPU reset, the root-of-trust
(a) measures the SM image, (b) generates [...],
[...], and (d) signs the measurement and
[...]. These standard
operations can be implemented in many ways [53, 58].
Keystone does not rely on a specific implementation. For
completeness, currently, Keystone simulates secure boot via
a modified first-stage bootloader for all the above steps.


Secure Source of Randomness. Keystone provides a secure
SM SBI call, random, which returns a 64-bit random value.
Keystone uses a hardware source of randomness if available
[...] [66] if applicable.


Remote Attestation. The Keystone SM performs the measurement
and the attestation based on the provisioned key.
Enclaves may request a signed attestation from the SM during
runtime. Keystone uses a standard scheme to bind the
attestation with a secure channel construction [41, 58] by
including limited arbitrary data (e.g., Diffie-Hellman key
parameters) in the signed attestation report. Key distribution
[15], revocation [46], attestation services [49], and anonymous
attestation [26] are orthogonal challenges.
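
The measurement step of Section 4.4 and the report binding described above can be sketched together. This is a simplified illustration under loud assumptions, not Keystone's code or report format: all function names are hypothetical, SHA-256 stands in for whatever hash the SM uses, an HMAC under a shared key stands in for the SM's signature with the provisioned key, and the page table is flattened into a dictionary from virtual page address to physical page address.

```python
import hashlib
import hmac

PAGE_SIZE = 4096


def validate_page_table(page_table, region_base, region_size):
    """Reject mappings outside the enclave region or aliasing a physical page."""
    seen = set()
    for vaddr, paddr in page_table.items():
        inside = (region_base <= paddr
                  and paddr + PAGE_SIZE <= region_base + region_size)
        if not inside:
            raise ValueError(f"mapping for {vaddr:#x} leaves the enclave region")
        if paddr in seen:
            raise ValueError(f"physical page {paddr:#x} is mapped twice")
        seen.add(paddr)


def measure_enclave(page_table, memory, config):
    """Hash the configuration, then each page's virtual address and contents.

    Pages are visited in virtual-address order, so the result does not
    depend on where the OS placed the pages physically.
    """
    digest = hashlib.sha256()
    digest.update(config)
    for vaddr in sorted(page_table):
        digest.update(vaddr.to_bytes(8, "little"))
        digest.update(memory[page_table[vaddr]])
    return digest.digest()


def attest(sm_key, measurement, user_data):
    """Bind caller-supplied data (e.g., key-exchange parameters) to the measurement."""
    tag = hmac.new(sm_key, measurement + user_data, hashlib.sha256).digest()
    return {"measurement": measurement, "data": user_data, "tag": tag}


def verify(sm_key, report, expected_measurement):
    expected_tag = hmac.new(
        sm_key, report["measurement"] + report["data"], hashlib.sha256
    ).digest()
    return (hmac.compare_digest(expected_tag, report["tag"])
            and report["measurement"] == expected_measurement)


# Two loads of the same enclave at different physical addresses.
code_page, data_page = b"\x13" * PAGE_SIZE, bytes(PAGE_SIZE)
base, size = 0x9000_0000, 2 * PAGE_SIZE
table_a = {0x1000: base, 0x2000: base + PAGE_SIZE}
table_b = {0x1000: base + PAGE_SIZE, 0x2000: base}
memory_a = {base: code_page, base + PAGE_SIZE: data_page}
memory_b = {base + PAGE_SIZE: code_page, base: data_page}

validate_page_table(table_a, base, size)
validate_page_table(table_b, base, size)
config = b"hypothetical-config"
measurement_a = measure_enclave(table_a, memory_a, config)
measurement_b = measure_enclave(table_b, memory_b, config)

report = attest(b"demo-key", measurement_a, b"dh-public-value")
print(measurement_a == measurement_b, verify(b"demo-key", report, measurement_b))
```

Hashing virtual addresses rather than physical ones is what lets a verifier compare against a single known-good measurement even though the OS is free to place the enclave anywhere in physical memory.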


Other Primitives. Keystone can support other primitives, if
required by the TEE: (a) it allows enclaves to access the read-only
[...]. The [...] [35], and
sealed storage [15] with these features.


4.6 Platform-Specific Extensions


Keystone can leverage additional security and functionality
features of the hardware to provide stronger security
guarantees and/or additional features to the enclave
at the cost of various trade-offs. We demonstrate several
examples of [...] the SM for a specific platform so


D. Lee et al. 


[...]. We use the HiFive Freedom
Unleashed [9] RISC-V dev board containing a Rocket-based
quad-core SoC chip (FU540) with a proprietary L2 controller.


Secure On-chip Memory. To protect the enclaves against a
physical attacker who has access to the DRAM, we implemented
an [...] (Figure 5). It allows
[...] chip package. On the FU540, we dynamically instantiate
a scratchpad memory of up to 2MB via the L2 memory
controller to generate a usable on-chip memory region. The
scratchpad is then allocated exclusively to the requesting
enclave for its entire lifetime. An enclave requesting to run in
the on-chip memory loads nearly identically to the standard 
procedure with the following changes: (a) the host loads the 
enclave to the OS allocated memory region with modified 
initial page tables referencing the final scratchpad address; 
and (b) the SM copies the standard enclave memory region 
into the new scratchpad region before the measurement. Any 
context switch to the enclave now results in an execution 
in the scratchpad memory. This uses only our basic enclave 
life-cycle hooks for the platform-specific features and does 
not require further modification of the SM. The only other 
change required was a modification of the untrusted enclave 
loading process to make it aware of the physical address 
region that the scratchpad occupies. No modifications to the 
Eyrie RT or the eapps are required. 
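
The two load-time changes above can be illustrated with a minimal sketch. It is a simplified model under stated assumptions, not the SM implementation: the function names, the page-table representation, and the scratchpad base address used in the usage note are hypothetical, and SHA-3 from the Python standard library stands in for the measurement.

```python
import hashlib

PAGE_SIZE = 4096


def build_page_table(num_pages, final_base):
    """Host side (hypothetical helper): map virtual page i to its FINAL
    physical address inside the scratchpad, not to the staging DRAM."""
    return {vpn: final_base + vpn * PAGE_SIZE for vpn in range(num_pages)}


def sm_relocate_and_measure(dram, os_base, scratchpad, num_pages):
    """SM side (hypothetical helper): copy the enclave image from the
    OS-allocated DRAM region into the scratchpad, then measure the copy."""
    size = num_pages * PAGE_SIZE
    if size > len(scratchpad):
        raise ValueError("enclave does not fit in the scratchpad")
    # Step (b): copy before measurement, so the hash covers what will run.
    scratchpad[:size] = dram[os_base:os_base + size]
    return hashlib.sha3_512(bytes(scratchpad[:size])).hexdigest()
```

For example, a two-page enclave staged at DRAM offset `2 * PAGE_SIZE` would be loaded with `build_page_table(2, 0x0800_0000)` followed by `sm_relocate_and_measure(dram, 2 * PAGE_SIZE, pad, 2)`, where `0x0800_0000` is only an illustrative scratchpad address.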


Cache Partitioning. Enclaves are vulnerable to cache side-channel
attacks from the untrusted OS and other applications
via a shared cache. To this end, we implement a cache
partitioning scheme using two hardware capabilities: (a)


cat [70]; (b) 


(Figure 5(b)). During the enclave execution,
only the cache lines from the enclave physical memory
are in the partition and are thus protected by PMP. The ad- 
versary cannot insert cache lines in this partition during the 
enclave execution due to the line replacement way-masking 


mechanism. As a net effect, adversary (Acache) gains no 


residency size of the enclave’s cache. Ways are partitioned at 
runtime and are available to the host whenever the enclave 
is not executing even if paused. 
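
The effect of replacement way-masking can be sketched with a toy model of a single cache set. This is only an illustration under our own assumptions (a round-robin victim choice, static masks, and a hypothetical `CacheSet` class); it does not model the FU540 L2 controller or the runtime re-assignment of ways to the host.

```python
class CacheSet:
    """Toy model of one set in a set-associative cache with way masking."""

    def __init__(self, num_ways):
        self.ways = [None] * num_ways  # each entry is (owner, tag) or None
        self.next_victim = 0

    def access(self, owner, tag, way_mask):
        """Return True on a hit. On a miss, fill a way permitted by way_mask."""
        if (owner, tag) in self.ways:
            return True
        allowed = [w for w in range(len(self.ways)) if (way_mask >> w) & 1]
        if not allowed:
            raise ValueError("way mask permits no replacement")
        # Round-robin victim selection, restricted to the permitted ways.
        victim = allowed[self.next_victim % len(allowed)]
        self.next_victim += 1
        self.ways[victim] = (owner, tag)
        return False
```

With a 16-way set, giving the enclave mask `0x00FF` and the host mask `0xFF00` means that no amount of host activity can evict a line the enclave has brought in, which is the property the partitioning relies on.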


Dynamic Resizing. A statically pre-defined maximum enclave
size and the subsequent static physical or virtual memory
pre-allocations (a) prevent the enclave from scaling dynamically
based on workload and (b) complicate porting applications
to eapps. To this end, Keystone allows the SM to dynamically
change the physical memory boundaries of the enclave. The


. If the OS succeeds in allocating, the SM increases 
the enclave’s size by extending the relevant PMP entries and 
notifies the RT, which then uses the free memory module to 
manage the new physical pages (see Section 5.1). 
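
A minimal sketch of the extension step follows. It assumes that the OS is asked for memory contiguous with the current enclave region; `PmpRegion`, `sm_extend_enclave`, and the `os_alloc` callback are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PmpRegion:
    """Hypothetical stand-in for the PMP entry covering an enclave."""
    base: int
    size: int

    @property
    def end(self):
        return self.base + self.size


def sm_extend_enclave(region, os_alloc, extra):
    """Try to grow the enclave by `extra` bytes.

    os_alloc(addr, size) models the untrusted OS and returns True if it
    could allocate `size` bytes starting at `addr`. On success the PMP
    region is extended and the new range is returned so that the RT can
    hand it to its free-memory module; on failure nothing changes.
    """
    if not os_alloc(region.end, extra):
        return None
    old_end = region.end
    region.size += extra
    return (old_end, region.end)
```

Returning the newly added range keeps the SM free of any allocation policy: it only moves the isolation boundary, and the RT decides how to use the pages.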


5 Keystone Modular Runtime 


As the SM physically isolates each of the enclaves, we can 
(i.e., 
the RT). This enables modular system-level abstraction for 
eapps (e.g., virtual memory management). Although the RT 
is similar in functionality to a kernel inside an enclave, it does
not require most kernel functionality. We built a modular
exemplar RT, Eyrie, to give enclave developers the ability
to include only the necessary functionality and reduce the TCB.
Given the supervisor capability, we can cleanly implement
selected kernel functionality without modifying user
applications. The additional privilege layer allows for further 
defensive design, such as only allowing the RT access to the 
shared memory buffer. Moreover, it eases porting
a full-fledged kernel such as seL4 into the
enclave. We introduce key Keystone RT modules and show
how they support various workloads with a small TCB.


5.1 Enclave Memory Management Modules 


they have the privileges to manage their own memory and
need not cross the host-enclave isolation boundary. By de-
fault, Keystone enclaves occupy a fixed contiguous physical 
memory allocated by the OS with a statically-mapped virtual 
address space at load time. While suitable for some embedded 
applications, it limits the memory usage of most legacy applications.
To this end, we describe several optional modules
to enable flexible r gement 
Free memory. We built a module that allows the Eyrie RT to
perform page table management, after the enclave reserves 
unmapped physical memory. Thus, the page mappings need 
not be pre-defined at creation time. The unmapped (hence, 
free) memory region is not included in the enclave mea- 
surement and is zeroed before beginning the eapp execution. 
The free-memory module is required for other more complex 
memory modules. 
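
The idea behind the free-memory module, namely that mappings are created on demand from a pool of unmapped physical pages, can be sketched as follows. The class and method names are hypothetical, the page table is modeled as a plain dictionary, and zeroing of the free pages is omitted; the Eyrie implementation is written in C and differs in detail.

```python
PAGE_SIZE = 4096


class FreeMemory:
    """Toy model of an RT-managed pool of unmapped physical pages."""

    def __init__(self, base, num_pages):
        self.free = [base + i * PAGE_SIZE for i in range(num_pages)]
        self.page_table = {}  # virtual page number -> physical address

    def add_pages(self, base, num_pages):
        """Accept new physical pages, e.g. after the enclave is resized."""
        self.free.extend(base + i * PAGE_SIZE for i in range(num_pages))

    def map_page(self, vpn):
        """Map a virtual page on demand; the mapping is not fixed at creation."""
        if vpn in self.page_table:
            return self.page_table[vpn]
        if not self.free:
            raise MemoryError("enclave is out of free physical pages")
        self.page_table[vpn] = self.free.pop()
        return self.page_table[vpn]

    def unmap_page(self, vpn):
        """Return the backing page of vpn to the free pool."""
        self.free.append(self.page_table.pop(vpn))
```

An in-enclave `mmap` or `brk` handler would call `map_page` for each newly touched page, and a paging module would call `unmap_page` after evicting a page.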


In-Enclave Self Paging. We implemented a generic in-enclave
page swapping module for the Eyrie RT. It handles the
enclave page-faults and uses a gene t 
module uses a aapke Bale eapp- only page eviction pol- 
icy. It works in conjunction with the free memory module for 
virtual memory management in the Eyrie RT. Put together, 
they help to alleviate the tight memory restrictions an en- 
clave may have due to the limited DRAM or the on-chip 
memory size [71-73]. 
6.2 Protecting the Host OS


OS, so an Asw in our case is stronger than in SGX. We 
ensure that the host OS is not susceptible to new attacks 
from the enclave because 


—— 00; () 


te -CausS tne IVi °TI ] 
(TLB, registers, L1-cache, etc. 
enclave and the OS; (d) DoS 


1 upted Dy 


6.3 Protection of the SM 
The SM naturally distrusts all the lower-privilege software
components (eapps, RTs, host OS, etc.). It is protected from 
an Asw because all the SM memory is isolated using PMP 
and is inaccessible to any enclave or the host OS. The SM SBI 
is another potential avenue of attack. Keystone’s SM presents 
a narrow, well-defined SBI to the S-mode code. It does not 


do complex resource management and is small enough to be 


formally verified [41, 68]. The SM is only a reference monitor;
it does not require scheduled execution time, so an ADoS is
not a concern. The SM can defend against an Acache and an
ATime with known techniques [42, 54].


6.4 Protection Against Physical Attackers 


Keystone can protect against a physical adversary via plat- 
form features and a proposed modification to the bootloader. 
Similar to Chen et al. [33], the SM d to stor 


the decrypted code and data, 


pages, similar in concept to the SGX EPC. This fully guar- 
antees the confidentiality and integrity of the enclave code 
and data from an attacker with control of DRAM. 

The SM should be executed entirely from the on-chip
memory. The SM is statically sized and has a relatively small
in-memory footprint (< 150Kb). On the FU540, this would
involve repurposing a portion of the L2 loosely-integrated 
memory (LIM) via a modified trusted bootloader. 


integrity protected (e.g., swapped enclave pages). Keystone
accomplishes this with i 


With these techniques in place, content outside of the chip — 


ae ee 


           Core           Cache Size (KB)   Latency (cycles)   # of TLB Entries
Platform   #   Type       L1-I/D   L2       L1   L2            L1   L2
Rocket-S   1   in-order   8/8      512      2    24            8    128
Rocket     1   in-order   16/16    512      2    24            32   1024
BOOM       1   OoO        32/32    2048     4    24            32   1024
FU540      4   in-order   32/32    2048     2    12-15*        32   128


Table 2. Hardware specification for each platform. L2 cache latency 
in FU540 (*) is based on estimation. 


7 Evaluation 

We aim to answer the following questions in our evaluation: 

(RQ1) Modularity. Is the Keystone framework viable in 
different configurations for real applications? 

(RQ2) TCB. What is the TCB of a Keystone-instantiated 
TEE in various deployment modes? 

(RQ3) Performance. How much overhead do simple Key- 
stone TEEs add to eapp execution time? 

(RQ4) Real-world Applications. Does Keystone provide 
expressiveness with minimal developer efforts for 
eapps? 


7.1 Implementation & Experimental Setup 


We implemented our SM on top of the Berkeley Boot Loader 
(bbl) [13]. It supports Canadai and other 
features. We implemented the initialization of the SM at boot 
as well as the SBI specified in Table 1. Platform-specific ex- 
tensions have been implemented with hooks in SBI functions. 
We simulated unavailable hardware primitives such as the 
random number generator and the root of trust. All modules 
in Sections 4 and 5 are available as compile-time options. 

We implemented the Eyrie RT from scratch in C. Memory 
encryption is done via software AES-128 [2] and integrity 
protection is partially implemented. We ported the seL4 
microkernel [55] to Keystone by modifying 290 LoC for 
boot, memory initialization, and interrupt handling. There 
is no inherent restriction to these two RTs, and we expect to 
add further options. 

Our host user-land interface for interactions with the en- 
claves is provided via a Linux kernel driver that creates 
a device endpoint (/dev/Keystone). The untrusted host OS 
(i.e., Linux) launches and manages the enclaves via SBI on 
behalf of the user, and also manages the enclave ownership 
and enclave-related OS resources. 

We provide several libraries (edge-calls, host-side syscall 
endpoints, attestation, etc.) in C and C++ for the host, the 
eapp, and interaction with the driver-provided Linux de- 
vice. Our provided tools generate the enclave measurements 
(hashes) without requiring RISC-V hardware, customize the 
Eyrie RT, and package the host application, eapps, and RT 
into a single binary. We have a complete top-level build so- 
lution to generate a bootable Linux image (based on the 


Keystone 


Protecting the Page Content Leaving the Enclave. When


V t of tl y (either 
an on- pi memory or the protected a of the DRAM). 
When these pages have to be copied out, their content needs 
to be protected. Thus, as part of the in-enclave page manage- 
ment, we implement a backing-store layer that can include 
Peepe low for the se- 
cure content to be paged out to the insecure storage (DRAM | 

). The protection can be done either in the 
software as a part of the Keystone RT (Figure 5(d)) or by a
dedicated trusted hardware unit—a memory encryption en- 
gine (MEE) [44]—with the SM’s on-chip memory capability 
(Figure 5). Admittedly, this incurs significan 
[epee ene ARR TERED 
optimizations. The amount of available on-chip memory for 


enclave. Keystone design is agnostic to the specific integrity 
schemes and can reuse the existing mechanisms [65, 79]. 


5.2 Functionality Modules 


Next, we demonstrate various functionality modules in Eyrie. 


Edge Call Interface. The eapp cannot access the non-enclave
memory in Keystone. If it needs to read/write the data outside 


the enclave, the Eyrie RT performs edge calls on its behalf. 
Our edge call, which is functionally similar to RPC, consists 
of an index to a function implemented in the untrusted host 
application and the parameters to be passed to the function. 
Eyrie tunnels such a call safely to the untrusted host, copies 
the return values of the function back to the enclave, and 
sends them to the eapp. The copying mechanism requires
Eyrie to have access to a buffer shared with the host. To
enable this: (a) the OS allocates a shared buffer in the host
memory; (b) the SM passes the address to the enclave so the


RT may access this memory; (c) the SM uses a separate PMP 
entry to enable OS access to this shared buffer. All the edge 


calls have to pass through the Eyrie RT as the eapp does
not have access to the shared memory virtual mappings.
This module can be used to add support for syscalls, IPC,
enclave-enclave communication, and so on. As the current
edge interface is a straightforward shared memory region,
it can easily re-use alternative methods for dispatching calls
such as mailboxes or HotCalls [89].
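
The RPC-like structure of an edge call can be sketched as follows. The wire format (a header holding a function index and a payload length), the status codes, and all function names are assumptions made for this illustration and are not the Keystone edge-call ABI.

```python
import struct

# Assumed header layout: function index (or status on return), payload length.
HEADER = struct.Struct("<II")


def rt_marshal(shared, call_id, payload):
    """RT side: place the call index and arguments in the shared buffer."""
    if HEADER.size + len(payload) > len(shared):
        raise ValueError("arguments do not fit in the shared buffer")
    HEADER.pack_into(shared, 0, call_id, len(payload))
    shared[HEADER.size:HEADER.size + len(payload)] = payload


def host_dispatch(shared, table):
    """Untrusted host side: look up the function by index and run it."""
    call_id, length = HEADER.unpack_from(shared, 0)
    payload = bytes(shared[HEADER.size:HEADER.size + length])
    status, result = 1, b""  # 1: unknown call index
    if call_id < len(table):
        status, result = 0, table[call_id](payload)
    if HEADER.size + len(result) > len(shared):
        status, result = 2, b""  # 2: return value too large
    HEADER.pack_into(shared, 0, status, len(result))
    shared[HEADER.size:HEADER.size + len(result)] = result


def rt_unmarshal(shared):
    """RT side: validate the host-written length before copying back."""
    status, length = HEADER.unpack_from(shared, 0)
    if HEADER.size + length > len(shared):
        raise ValueError("host returned an out-of-bounds length")
    return status, bytes(shared[HEADER.size:HEADER.size + length])
```

The bounds check in `rt_unmarshal` reflects that the host controls the shared buffer, so the RT should treat every field it reads back as untrusted input before passing data to the eapp.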

We allow the proxying of syscalls from the eapp to the host 
application by re-using the edge call interface. The user host 
application then invokes the syscall on an untrusted OS on 
behalf of the eapp, collects the return values, and forwards 
them to the eapp. Keystone can utilize existing defenses 
to prevent Iago attacks [32] via this interface [31, 73, 82]. 
Keystone resolves appropriate calls as in-enclave syscalls 
(e.g., mmap, brk, getrandom). Such calls are handled in Eyrie 
and invoke SM interfaces as needed (e.g., getrandom) before
returning to the eapp. 
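
The split between in-enclave and proxied syscalls can be sketched as a simple routing step. The set of in-enclave calls follows the examples named above; the `route_syscall` function and the handler and proxy callbacks are hypothetical.

```python
# Calls the text names as resolved inside the enclave by the RT.
IN_ENCLAVE = {"mmap", "brk", "getrandom"}


def route_syscall(name, args, rt_handlers, proxy):
    """Handle a syscall in the RT if possible, otherwise proxy it.

    rt_handlers maps a syscall name to an in-enclave handler; proxy(name,
    args) stands for forwarding the call to the host application over the
    edge-call interface and returning the host-provided result.
    """
    if name in IN_ENCLAVE:
        return rt_handlers[name](*args)
    return proxy(name, args)
```

Keeping memory and randomness calls inside the enclave avoids a round trip through the untrusted host for these operations, while file and network calls still rely on the host and need the usual checks on returned values.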


Multi-threading. We run multi-threaded eapps by delegat- 
ing the thread management to the runtime. We do not sup- 
port parallel multi-core enclave execution yet, but this can 


be implemented by allowing the SM to invoke enclave exe- 
cution multiple times in different cores. 


6 Security Analysis 


We argue the security of the enclave, the OS, and the SM 
based on the threat model outlined in Section 3.4. 


6.1 Protection of the Enclave 


Keystone attestation ensures that any modification of the SM,
RT, and the eapp is visible while creating the enclave. During
the enclave execution, any direct attempt by an Asw to access
the enclave memory (cached or uncached) is defeated by PMP.
All enclave data structures can only be modified by
the enclave or the SM, and both are isolated from direct access.
Subtle attacks such as controlled side channels (Acnt,1) are 


in Keystone as enclaves have dedicated page 
les. This ensures that 


any enclave executing with any Keystone instantiated TEE 
is always protected against the above attacks. 


Mapping Attacks. The 


mappings [45] and ensures that the mappings are valid. The 


RT initializes the page tables either during the enclave cre- 
ation or loads the pre-allocated (and SM validated) static 
mappings. During the enclave execution, the RT ensures 
that the layout is not corrupted while updating the map- 
pings (e.g., via mmap). 


pages, say via the dynamic memory resizing, the RT checks 
Similarly, if t. 

their content before returning them to the OS. _ 

Syscall Tampering Attacks. If the eapp and the RT invoke
untrusted functions implemented in the host process and/or 

and system call tampering attacks [32, 77]. Keystone can re- 

use the existing shielding systems [18, 31, 82] as RT modules 


to defend the enclave against these attacks. 


Side-channel Attacks. Keystone thwarts cache side-channel 
attacks (Section 4.6). Enclaves do not share any state with 
the host OS or the user application and hence are not ex- 
posed to controlled channel attacks. The SM performs a clean 
context switch and flushes the enclave state (e.g., TLB) 


mation leakage via the SM or the edge call API with known
defenses [83, 84].


(e.g., interrupts, faults), these are not visible to the host OS. 


TH gia aisacles apps beaa abeaub aS Cope 




SM Component          LoC    Runtime Component      LoC
Base                 1100    Base                  1800
Edge-call Handling     30    -                      300
Dynamic Memory         70    -                      100
Memory Isolation      500    libc Environment        50
Cache Partitioning    300    In-enclave Paging      300
Secure Boot           170    Syscalls               450
On-chip Memory         50    Free Memory            300
                             IO Syscall Proxying    300


Table 3. TCB Breakdowns for the Eyrie RT and SM features in LoC. 


tooling for the HiFive Freedom Unleashed) for QEMU, FPGA 
softcores, and the HiFive containing our SM, the driver, and 
the enclave binaries. 

We used four different platforms for our experiments; the 
HiFive Freedom Unleashed [9] with a closed-source FU540 
(at 1GHz), and three open-source RISC-V processors: small 
Rocket (Rocket-S), default Rocket [19], and Berkeley Out-of- 
order Machine (BOOM) [29] (See Table 2). We instantiate the 
open-source processors on cloud FPGAs using FireSim [52] 
which simulates the cores at 1GHz. The host OS is build- 
root Linux (kernel 4.15). All performance evaluation was 
performed on the HiFive and the data is averaged over 10 
runs unless otherwise specified. 


7.2 Modularity & Support 


We outline the qualitative measurement of Keystone flex- 
ibility in extending features, reducing TCB, and using the 
platform features. Table 3 shows the TCB breakdown of var- 
ious components (required and optional) for the SM and 
Eyrie RT. Most of the modifications (e.g., additional edge- 
call features) require no changes to the SM, and the eapp 
programmer may enable them as needed. Future additions 
(e.g., ports of interface shields) may be implemented exclu- 
sively in the RT. We also add support for a new RT by porting 
seL4 to Keystone and use it to execute various eapps (See 
Section 7.4). Keystone passes all the tests in seL4 suite and 
incurs less than 1% overhead on average over all test cases. 
The advantage of an easily modifiable SM layer is noticeable 
when features require interaction with the core TEE prim- 
itives like memory isolation. The SM features were able to 
take advantage of the L2 cache controller on the FU540 to 
offer additional security protections (cache-partitioning and 
on-chip isolation) without changes to the RT or eapp. 


TCB Breakdown. Keystone comprises the M-mode components
(bbl and SM), the RT, the untrusted host application,
the eapp, and the helper libraries, of which only a fraction
is in the TCB. The M-mode component is 10.7 KLoC: a
cryptographic library (4 KLoC), pre-existing trap handling, 
boot, and utilities (4.7 KLoC), the baseline SM (1.6 KLoC), 
and platform-specific code for FU540 (400 LoC). A minimum 
Eyrie RT is 1.8 KLoC, with modules adding further code as 
shown in Table 3 up to a maximum Eyrie RT TCB of 3.6 KLoC. 
Figure 6. Breakdown of operations during the enclave life-cycle.
(a) shows enclave validation and hashing duration, and (b) shows
the breakdown of other operations. (b) does not include duration of
size-dependent operations such as measurement in create (shown
in (a)) and memory cleaning in destroy (4K-11K cycles/page).


The current maximum TCB for an eapp running on our SM 
and Eyrie RT is thus a total of 15 KLoC. TCB calculations 
were made using cloc [8] and unifdef [14]. 


7.3 Benchmarks 


We use 4 standard benchmark suites with a mix of CPU, 
memory, and file I/O for system-wide analysis: Beebs, Core- 
Mark, RV8, and IOZone. We report the overheads of the 
cache partitioning and physical attacker protection with RV8 
as an example of Keystone trade-offs. In all the graphs, ‘other’ 
refers to the lifecycle costs for enclave creation, destruction, 
etc. All benchmarks are run as unmodified RISC-V binaries 
using an Eyrie runtime with relevant modules as needed. 
Common Operations. Figure 6 shows the breakdown of var- 
ious enclave operations. Initial validation and measurement 
dominate the startup with 2M and 7M cycles/page for FU540 
and Rocket-S due to an unoptimized software implementa- 
tion of SHA-3 [4]. The remaining enclave creation time totals 
20k-30k cycles. Similarly, the attestation is dominated by the 
ed25519 [6] signing software implementation (not shown 
in the graph, 0.7M-1.6M cycles). These are both one-time 
costs per-enclave and can be substantially optimized in software
or hardware. The most common SM operation, a context
switch, currently takes between 1.8K (FU540) and 2.6K (Rocket-S)
cycles depending on the platform. Notably, creation and
destruction of enclaves take longer on the FU540 (4-core) due
to the multi-core PMP synchronization.


Standard Benchmarks as Unmodified eapp Binaries. 
Beebs, CoreMark, and RV8. As expected, Keystone incurs


IOZone. All the target files are located on the untrusted host 
and we tunnel the I/O syscalls to the host application. Fig- 
ure 7 shows the throughput plots of common file- content ac- 
Figure 7. IOZone throughput in Keystone for various file and
record sizes (e.g., r8 represents 8KB record). We only show write 
and read results due to limited space. 
Figure 8. Full-execution time comparison for RV8. Each bar shows
the duration of the application (user or eapp), and the other overheads
(other). Keystone (keyst) and Keystone with cache partitioning
(keyst-cache) are compared to native execution (base).


via the untrusted buffer, (b) each call requires the RT to go 
oe 


f ers, incurring an additional throughput loss on re- 
write (avg. 38.0%), re-read (avg. 41.3%), and record re-write 
(avg. 55.1%) operations. Since (b) is a fixed cost per system 
call, it increases the overhead for the smaller record sizes. 


Cache Partitioning. The mix of pure-CPU and large working-set
benchmarks in RV8 is ideal to evaluate the impact
of cache partitioning. We granted 8 of the 16 ways in the L2
cache to the enclave during execution (see Figure 8). Small


miniz, aes) show overhead due to a smaller effective
cache. Enclave initialization latency is unaffected.
Physical Attacker Protections. We ran the RV8 suite with 
on-chip execution, enclave self-paging, page encryption, and 
a DRAM backing page store (Table 4). A few eapps (sha512, 
dhrystone), which fit in the 1MB on-chip memory, incur no
overhead and are protected even from APhy.


For example, primes


             Overhead (%)                              # of Page
Benchmark    Ø       C        O,P        O,P,E        Faults
primes       -0.9    40.5     65475.5    *            66 x 10°
miniz        0.1     128.5    80.2       615.5        18341
aes          =       66.3     1471.0     4552.7       59716
bigint       -0.1    1.6      0.4        12.0         168
qsort        -2.8    -1.3     12446.3    26832.3      285147
sha512       -0.1    0.3      -0.1       -0.2         0
norx         0.1     0.9      2590.1     7966.4       58834
dhrystone    -0.2    0.3      -0.2       0.2          0


Table 4. RV8 Overhead for different TEE design instances. Ø: base- 
line, C: cache partitioning, O: on-chip scratch pad execution (1MB), 
P: enclave self-paging, E: software-based memory encryption. *: 
does not complete in ~10 hrs. 


incurs the largest amount of page faults because it allocates 


encryption adds 2-4x more overhead to page faults. These
overheads can be alleviated by the Keystone framework if 
a larger on-chip memory or dedicated hardware memory 
encryption engine is available as we discussed in Section 5. 


7.4 Case Studies 


We demonstrate how Keystone can be adapted for a varied 
set of devices, workloads, and application complexities with 
three case-studies: (a) machine learning workloads for the 
client and server-side usage, (b) machine learning for var- 
ied RTs, (c) a small secure computation application written 
natively for Keystone. The evaluation for these case-studies 
was performed on the HiFive board. We used the unmodified 
application code logic, hard-coded all the configurations and 
arguments for simplicity, and statically linked the binaries 
against glibc or musl libc supported by the Eyrie RT. We 
ported the widely used cryptographic library libsodium to 
both Eyrie and seL4 RT trivially. 


Case-study 1: Secure ML Inference with Torch and Eyrie. 
We ran nine Torch-based models of increasing sizes with 
Eyrie on the Imagenet dataset [39] (see Table 5). They comprise
15.7 and 15.4 KLoC of TH [3] and THNN [5] libraries
from Torch compiled with musl libc. Each model has an additional
230 LoC to 13.4 KLoC of model-specific inference code [88].
We performed two sets of experiments: (a) execute the model 
inference code with static maximum enclave size; (b) with dy- 
namic resizing support to allow the enclave size to increase 
on-demand. Figure 9 shows the performance overheads for 
both configurations and non-enclaved execution baseline. 


Dynamic resizing 
reduces the initialization latency by 2.9% on average as the 
RT does not map free memory during enclave creation. 


dynamic resizing. The causes of this are: (a) Keystone loads 


Model          # of Layers   # of Param   App LOC   Binary Size   Memory Usage
Wideresnet     93            36.5M        1625      140MB         384MB
Resnext29      102           34.5M        1910      123MB         394MB
Inceptionv3    313           27.2M        5359      92MB          475MB
Resnet50       176           25.6M        3094      98MB          424MB
Densenet       910           8.1M         13399     32MB          570MB
VGG19          55            20.0M        1088      77MB          165MB
Resnet110      552           1.7M         9528      7MB           87MB
Squeezenet     65            1.2M         914       5MB           52MB
LeNet          12            62K          230       0.4MB         2MB

Table 5. Torch model specification, workload characteristics, binary 
object size, and total enclave memory usage. 
Figure 9. Inferencing time for various Torch models. Each bar
consists of the duration of the application (user or eapp), and the
other overheads (other). Keystone (keyst) and Keystone with the
dynamic resizing (keyst-dyn) are compared to native execution
(base).


aai, ee any page taolis dor zaro iiiar EEE 
or similar behavior, so smaller sized networks like LeNet 
execute faster in Keystone and (b) the overhead is primarily 
proportional to the number of layers in the network, as more 
layers result in more memory allocations and increase the
number of mmap and brk syscalls. We used a small hand- 
coded test to verify that Eyrie RT’s custom mmap is slower 
than the baseline kernel and incurs overheads. Densenet, 
which has the maximum number of layers (910), thus suffers 
from larger performance degradation. In summary, for long- 
running eapps, Keystone incurs a fixed one-time startup cost 
and the dynamic resizing is indeed useful for larger eapps. 


Case-study 2: Secure ML with FANN and seL4. Keystone 
can be used for small devices such as IoT sensors and cameras 
to train models locally as well as flag events with model 
inference. We ran FANN, a minimal (8KLoC C/C++) eapp 
for embedded devices with the seL4 RT to train and test a 
simple XOR network. The end-to-end execution overhead is 
0.36% over running in seL4 without Keystone. 


Case-study 3: Secure Remote Computation. We imple- 
mented a secure server eapp (and remote client) to count 
words in an input message using the Eyrie and baseline SM. It 
performs attestation, uses libsodium to bind a secure channel
to the attestation report, then polls the host for encrypted
messages using edge-calls, processes them inside the enclave,
and returns an encrypted reply to be sent to the client. The
eapp has secure channel code (60 LoC), the edge-wrapping 
interface (45 LoC), and other logic (60 LoC). The host is 270 
LoC and the remote client is 280 LoC. Keystone takes 45K cy- 
cles for a round-trip with an empty message, secure channel, 
and message passing overheads. It takes 47K cycles between 
the host getting a message and the enclave notifying the host 
to send a reply. 


8 Related Work 


Here, we survey TEEs and design trade-offs that have been 
explored in existing works. 


TEE Architectures & Extensions. Three TEEs are closely 
related to Keystone: (a) Intel Software Guard Extension (SGX) 
executes user-level code in an isolated virtual address space 
backed by encrypted RAM pages [64]; (b) ARM TrustZone 
divides the memory into two worlds (i.e., normal vs. secure) 
to run applications in protected memory [1]; and (c) Sanctum 
uses a machine-mode SM, the memory management unit 
(MMU), and cache partitioning to isolate enclave memory 
and prevent controlled-channel and cache side-channel at- 
tacks [36]. Several other TEEs explore design at layers such 
as hypervisors [34, 45, 61], physical memory [30, 57, 62], vir- 
tual memory [25, 37, 80], and process isolation [38, 76, 86, 87]. 
Interested readers can refer to Appendix A for a summary 
of TEE design choices. 


Re-purposing Existing TEEs for Modularity. One way to 
meet Keystone’s design goal of customizable TEEs is to reuse 
the TEE solutions that are available on commodity CPUs. For 
each TEE, it is possible to enable a subset of programming 
constructs (e.g., threading, dynamic loading of binaries) by 
including a software management component inside the en- 
clave [12, 22, 31]. Alternatively, adding hardware extensions 
which are specifically designed and implemented for adding 
TEE capabilities requires a lot of effort [36, 71]. Another approach
is to simulate the programmable layer, say with a
trusted hypervisor layer, which then executes an untrusted 
OS, but potentially inflates the TCB. 


Differences from Trusted Hypervisor. Keystone executes
the enclave logic in the supervisor mode (RT) and the user
mode (eapp), while the machine mode code (SM)
checks and enforces isolation boundaries.


ctly. Thus, Keystone SM is more anal- 
ogous to a reference monitor [16, 78]. 

TEE Support. Several works enhance existing TEEs. At the 
SM layer they optimize program-critical tasks [21, 36, 80]. 
At the hypervisor layer they add support for multiplexing 
the secure isolation enforced by hardware or use nested 
virtualization for isolation [23, 37, 47]. At the RT layer, they 
target portability, functionality, security [11, 12, 18, 22, 31, 67,
81]. At the eapp layer they reduce the developer efforts [20]. 
Although these systems are a fixed configuration in the TEE 
design space, they provide valuable lessons for Keystone 
features and optimization. 


Enhancing the Security of TEEs. Better and secure TEE 
design has been a long-standing goal, with advocacy for 
security-by-design [48, 75]. We point out that Keystone is 
not vulnerable to a large class of side-channel attacks [28, 90] 
by design, while speculative execution attacks [27, 56] are 
limited to out-of-order RISC-V cores (e.g., BOOM) and do not 
affect most SOC implementations (e.g., Rocket). Keystone 
can re-use known cache side-channel defenses [24, 54] as 
we demonstrated in Section 4.6. Lastly, Keystone can benefit 
from various RISC-V proposals underway to secure IO opera- 
tions with PMP [74]. Thus, Keystone either eliminates classes 
of attacks or allows integration with existing techniques. 


Formally Verified Hardware & Software.


tT nt. A careful and ground- 
up design with verified components [43, 55, 69] may provide 
stronger guarantees and Keystone can help explore designs 
which combine these with hardware protection [41, 85]. 


Resemblance with traditional kernel designs. Despite be- 
ing designed for the TEE threat model, Keystone borrows 
and builds on well-known principles from a long line of work 
in OS design. Specifically, our choice of separating isolation 
(SM) and functionality (RT) has been explored mainly in 
micro-kernels [60]. Further, like many other works, our SM 
is inspired by the concept of reference monitors [16, 78]. 
Lastly, the modularity of abstraction between the host OS, 
the RT and eapp is similar to exokernels [40]. 


9 Conclusion 


We present Keystone, the first framework for customizable 
TEEs. With our modular design, we showcase the use of Key- 
stone for several standard benchmarks and applications on 
illustrative RTs and various deployment platforms. Keystone 
serves as a framework for both TEE research and future 
deployment of novel TEE designs. 
Availability


The Keystone implementation for all platforms (QEMU, FireSim,
and HiFive Freedom Unleashed) is available at
https://github.com/keystone-enclave/keystone. The modified seL4
runtime is available at
https://github.com/keystone-enclave/keystone-seL4. General
information and documentation are available at
https://keystone-enclave.org.


A Trade-offs in existing TEEs 


Table 6 shows the trade-offs in the existing TEE or TEE-based 
systems. 
Table 6. Trade-offs in existing TEEs/extensions. Oo. Q, Q: best to worst
respectively. C3-6: resilience to software adversary, hardware adversary, 
side-channel adversary, controlled-channel adversary respectively. indicates 
complete protection; confidentiality only; no protection. C7: zero; thousands 
LoC; millions LoC. C8: zero; non-zero hardware; micro-architectural modi- 
fications. C9: enclave self resource management; partial; no flexibility. C10: 
range of apps supported are maximum; specific class; only written from 
scratch. C11: expressiveness includes forking, multi-threading, syscalls, 
shared memory; partial; none of these. C12: dev-effort for porting is un- 
modified binaries; compiling and/or configuration files; re-writing. 